AITopics | statistical bias

statistical bias

9961e42624a6c083279303767c73269d-Paper-Conference.pdf

Neural Information Processing Systems | Feb-16-2026, 21:18:54 GMT

artificial intelligence, ece, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.92)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

SHIELD: Suppressing Hallucinations In LVLM Encoders via Bias and Vulnerability Defense

Huang, Yiyang, Shi, Liang, Zhang, Yitian, Xu, Yi, Fu, Yun

arXiv.org Artificial Intelligence | Oct-21-2025

Large Vision-Language Models (LVLMs) excel in diverse cross-modal tasks. However, object hallucination, where models produce plausible but inaccurate object descriptions, remains a significant challenge. In contrast to previous work focusing on LLM components, this paper is the first to trace LVLM hallucinations to visual encoders and identifies three key issues: statistical bias, inherent bias, and vulnerability. To address these challenges, we propose SHIELD, a training-free framework that mitigates hallucinations through three strategies: re-weighting visual tokens to reduce statistical bias, introducing noise-derived tokens to counter inherent bias, and applying adversarial attacks with contrastive decoding to address vulnerability. Experiments demonstrate that SHIELD effectively mitigates object hallucinations across diverse benchmarks and LVLM families. Moreover, SHIELD achieves strong performance on the general LVLM benchmark, highlighting its broad applicability. Code will be released.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.16596

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.46)
Information Technology > Security & Privacy (0.34)
Government > Military (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)

Information-Theoretic Generalization Analysis for Expected Calibration Error

Neural Information Processing Systems | Oct-10-2025, 10:50:13 GMT

While the expected calibration error (ECE), which employs binning, is widely adopted to evaluate the calibration performance of machine learning models, theoretical understanding of its estimation bias is limited. In this paper, we present the first comprehensive analysis of the estimation bias in the two common binning strategies, uniform mass and uniform width binning. Our analysis establishes upper bounds on the bias, achieving an improved convergence rate. Moreover, our bounds reveal, for the first time, the optimal number of bins to minimize the estimation bias. We further extend our bias analysis to generalization error analysis based on the information-theoretic approach, deriving upper bounds that enable the numerical evaluation of how small the ECE is for unknown data. Experiments using deep learning models show that our bounds are nonvacuous thanks to this information-theoretic generalization analysis approach.
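
For readers unfamiliar with the two binning strategies named in the abstract, the following is a minimal sketch of a binned ECE estimate, not the authors' code. It assumes binary correctness labels and confidences in [0, 1]; the function names (uniform_width_edges, uniform_mass_edges, binned_ece) and the toy data are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def uniform_width_edges(n_bins):
    """Interior edges that split [0, 1] into n_bins equal-width intervals."""
    return [i / n_bins for i in range(1, n_bins)]


def uniform_mass_edges(confidences, n_bins):
    """Interior edges at empirical quantiles, so bins hold roughly equal counts."""
    ordered = sorted(confidences)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def binned_ece(confidences, correct, edges):
    """Sum over bins of (bin fraction) * |bin accuracy - bin mean confidence|."""
    n_bins = len(edges) + 1
    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins
    for c, y in zip(confidences, correct):
        b = bisect_right(edges, c)  # index of the bin that contains c
        conf_sum[b] += c
        hit_sum[b] += y
        count[b] += 1
    n = len(confidences)
    # (count/n) * |hits/count - conf/count| simplifies to |hits - conf| / n.
    return sum(abs(hit_sum[b] - conf_sum[b]) / n for b in range(n_bins) if count[b])


# Toy data (hypothetical): predicted confidence and whether the prediction was right.
confidences = [0.95, 0.9, 0.85, 0.8, 0.6, 0.55]
correct = [1, 1, 0, 1, 0, 1]

print("uniform width:", binned_ece(confidences, correct, uniform_width_edges(3)))
print("uniform mass: ", binned_ece(confidences, correct, uniform_mass_edges(confidences, 3)))
```

The two calls differ only in where the bin edges fall, which is why the number of bins and the binning strategy both affect the estimation bias the abstract discusses.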

bin, ece, tce, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.92)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Fuck the Algorithm: Conceptual Issues in Algorithmic Bias

Stinson, Catherine

arXiv.org Artificial Intelligence | May-21-2025

Algorithmic bias has been the subject of much recent controversy. To clarify what is at stake and to make progress resolving the controversy, a better understanding of the concepts involved would be helpful. The discussion here focuses on the disputed claim that algorithms themselves cannot be biased. To clarify this claim we need to know what kind of thing 'algorithms themselves' are, and to disambiguate the several meanings of 'bias' at play. This further involves showing how bias of moral import can result from statistical biases, and drawing connections to previous conceptual work about political artifacts and oppressive things. Data bias has been identified in domains like hiring, policing and medicine. Examples where algorithms themselves have been pinpointed as the locus of bias include recommender systems that influence media consumption, academic search engines that influence citation patterns, and the 2020 UK algorithmically-moderated A-level grades. Recognition that algorithms are a kind of thing that can be biased is key to making decisions about responsibility for harm, and preventing algorithmically mediated discrimination.

artificial intelligence, information management, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.13509

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > Ontario > Kingston (0.04)
Oceania > New Zealand > South Island > Canterbury > Christchurch (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Education > Educational Setting (1.00)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Information Management > Search (0.88)

Generalization vs. Memorization in the Presence of Statistical Biases in Transformers

Mitros, John

arXiv.org Machine Learning | Sep-6-2024

This study aims to understand how statistical biases affect the model's ability to generalize to in-distribution and out-of-distribution data on algorithmic tasks. Prior research indicates that transformers may inadvertently learn to rely on these spurious correlations, leading to an overestimation of their generalization capabilities. To investigate this, we evaluate transformer models on several synthetic algorithmic tasks, systematically introducing and varying the presence of these biases. We also analyze how different components of the transformer models impact their generalization. Our findings suggest that statistical biases impair the model's performance on out-of-distribution data, leading to an overestimation of its generalization capabilities. The models rely heavily on these spurious correlations for inference, as indicated by their performance on tasks including such biases.

memorization, statistical bias, transformer

arXiv.org Machine Learning

2409.04654

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.40)

JobFair: A Framework for Benchmarking Gender Hiring Bias in Large Language Models

Wang, Ze, Wu, Zekun, Guan, Xin, Thaler, Michael, Koshiyama, Adriano, Lu, Skylar, Beepath, Sachin, Ertekin, Ediz Jr., Perez-Ortiz, Maria

arXiv.org Artificial Intelligence | Jun-17-2024

This paper presents a novel framework for benchmarking hierarchical gender hiring bias in Large Language Models (LLMs) for resume scoring, revealing significant issues of reverse bias and overdebiasing. Our contributions are fourfold: First, we introduce a framework using a real, anonymized resume dataset from the Healthcare, Finance, and Construction industries, meticulously used to avoid confounding factors. It evaluates gender hiring biases across hierarchical levels, including Level bias, Spread bias, Taste-based bias, and Statistical bias. This framework can be generalized to other social traits and tasks easily. Second, we propose novel statistical and computational hiring bias metrics based on a counterfactual approach, including Rank After Scoring (RAS), Rank-based Impact Ratio, Permutation Test-Based Metrics, and Fixed Effects Model-based Metrics. These metrics, rooted in labor economics, NLP, and law, enable holistic evaluation of hiring biases. Third, we analyze hiring biases in ten state-of-the-art LLMs. Six out of ten LLMs show significant biases against males in healthcare and finance. An industry-effect regression reveals that the healthcare industry is the most biased against males. GPT-4o and GPT-3.5 are the most biased models, showing significant bias in all three industries. Conversely, Gemini-1.5-Pro, Llama3-8b-Instruct, and Llama3-70b-Instruct are the least biased. The hiring bias of all LLMs, except for Llama3-8b-Instruct and Claude-3-Sonnet, remains consistent regardless of random expansion or reduction of resume content. Finally, we offer a user-friendly demo to facilitate adoption and practical application of the framework.
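
The abstract names permutation-test-based metrics built on a counterfactual approach but does not spell them out. As a rough illustration of the general idea only, and not the paper's metric, here is a minimal sketch of an exact paired sign-flip permutation test. It assumes each resume is scored twice, once per gender variant; the function name sign_flip_p_value and the scores are hypothetical.

```python
from itertools import product


def sign_flip_p_value(diffs):
    """Exact two-sided paired permutation test on the summed score difference.

    Under the null hypothesis that swapping the gender marker has no effect,
    the sign of each per-resume difference is arbitrary, so every sign
    pattern is equally likely.
    """
    observed = abs(sum(diffs))
    total = 0
    extreme = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for float inputs
            extreme += 1
    return extreme / total


# Hypothetical scores for six resumes, each scored with both gender variants.
scores_male = [7, 6, 8, 5, 7, 6]
scores_female = [8, 7, 8, 6, 8, 7]
diffs = [f - m for f, m in zip(scores_female, scores_male)]

print("p-value:", sign_flip_p_value(diffs))
```

Enumerating every sign pattern costs 2^n evaluations, so this exact form only suits a handful of pairs; with more pairs one would typically sample random sign patterns instead.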

applicant, gemini-1, llm, (17 more...)

arXiv.org Artificial Intelligence

2406.15484

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.95)

Industry:

Law (1.00)
Health & Medicine (1.00)
Information Technology (0.67)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Rating Multi-Modal Time-Series Forecasting Models (MM-TSFM) for Robustness Through a Causal Lens

Lakkaraju, Kausik, Kaur, Rachneet, Zeng, Zhen, Zehtabi, Parisa, Patra, Sunandita, Srivastava, Biplav, Valtorta, Marco

arXiv.org Machine Learning | Jun-12-2024

AI systems are notorious for their fragility; minor input changes can potentially cause major output swings. When such systems are deployed in critical areas like finance, the consequences of their uncertain behavior could be severe. In this paper, we focus on multi-modal time-series forecasting, where imprecision due to noisy or incorrect data can lead to erroneous predictions, impacting stakeholders such as analysts, investors, and traders. Recently, it has been shown that beyond numeric data, graphical transformations can be used with advanced visual models to achieve better performance. In this context, we introduce a rating methodology to assess the robustness of Multi-Modal Time-Series Forecasting Models (MM-TSFM) through causal analysis, which helps us understand and quantify the isolated impact of various attributes on the forecasting accuracy of MM-TSFM. We apply our novel rating method on a variety of numeric and multi-modal forecasting models in a large experimental setup (six input settings of control and perturbations, ten data distributions, time series from six leading stocks in three industries over a year of data, and five time-series forecasters) to draw insights on robust forecasting models and the context of their strengths. Within the scope of our study, our main result is that multi-modal (numeric + visual) forecasting, which was found to be more accurate than numeric forecasting in previous studies, can also be more robust in diverse settings. Our work will help different stakeholders of time-series forecasting understand the models' behaviors along trust (robustness) and accuracy dimensions to select an appropriate model for forecasting using our rating method, leading to improved decision-making.

mm-tsfm, perturbation, stock price, (15 more...)

arXiv.org Machine Learning

2406.12908

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > Greenland (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Information Technology (0.93)
Banking & Finance > Trading (0.90)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Information-theoretic Generalization Analysis for Expected Calibration Error

Futami, Futoshi, Fujisawa, Masahiro

arXiv.org Machine Learning | May-24-2024

While the expected calibration error (ECE), which employs binning, is widely adopted to evaluate the calibration performance of machine learning models, theoretical understanding of its estimation bias is limited. In this paper, we present the first comprehensive analysis of the estimation bias in the two common binning strategies, uniform mass and uniform width binning. Our analysis establishes upper bounds on the bias, achieving an improved convergence rate. Moreover, our bounds reveal, for the first time, the optimal number of bins to minimize the estimation bias. We further extend our bias analysis to generalization error analysis based on the information-theoretic approach, deriving upper bounds that enable the numerical evaluation of how small the ECE is for unknown data. Experiments using deep learning models show that our bounds are nonvacuous thanks to this information-theoretic generalization analysis approach.

bin, ece, tce, (15 more...)

arXiv.org Machine Learning

2405.15709

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

On Fairness and Stability: Is Estimator Variance a Friend or a Foe?

Khan, Falaah Arif, Herasymuk, Denys, Stoyanovich, Julia

arXiv.org Artificial Intelligence | Feb-9-2023

The error of an estimator can be decomposed into a (statistical) bias term, a variance term, and an irreducible noise term. When we do bias analysis, formally we are asking the question: "how good are the predictions?" The role of bias in the error decomposition is clear: if we trust the labels/targets, then we would want the estimator to have as low bias as possible, in order to minimize error. Fair machine learning is concerned with the question: "Are the predictions equally good for different demographic/social groups?" This has naturally led to a variety of fairness metrics that compare some measure of statistical bias on subsets corresponding to socially privileged and socially disadvantaged groups. In this paper we propose a new family of performance measures based on group-wise parity in variance. We demonstrate when group-wise statistical bias analysis gives an incomplete picture, and what group-wise variance analysis can tell us in settings that differ in the magnitude of statistical bias. We develop and release an open-source library that reconciles uncertainty quantification techniques with fairness analysis, and use it to conduct an extensive empirical analysis of our variance-based fairness measures on standard benchmarks.
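
The decomposition in the first sentence can be checked numerically. The sketch below is illustrative and is not the authors' open-source library. It assumes a noise-free target, so the irreducible noise term is zero and mean squared error equals squared bias plus variance. The function name decompose and the two groups of estimates are hypothetical.

```python
from statistics import fmean, pvariance


def decompose(estimates, target):
    """Split the mean squared error of repeated estimates of a known target
    into squared bias and variance (the target is assumed noise-free)."""
    mean_est = fmean(estimates)
    bias_sq = (mean_est - target) ** 2
    variance = pvariance(estimates, mu=mean_est)
    mse = fmean([(e - target) ** 2 for e in estimates])
    return bias_sq, variance, mse


# Hypothetical repeated estimates of the same target (1.0) for two groups.
group_a = [0.9, 1.1, 1.0, 1.0]
group_b = [0.5, 1.5, 0.5, 1.5]

for name, estimates in (("A", group_a), ("B", group_b)):
    bias_sq, variance, mse = decompose(estimates, 1.0)
    print(f"group {name}: bias^2={bias_sq:.4f} variance={variance:.4f} mse={mse:.4f}")
```

Both hypothetical groups have zero statistical bias, yet group B's estimates are far less stable. That is the kind of case where a bias-only comparison between groups would miss a difference that a variance comparison picks up.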

artificial intelligence, machine learning, variance, (14 more...)

arXiv.org Artificial Intelligence

2302.04525

Country:

North America > United States > Illinois > Cook County > Chicago (0.06)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

An "Unbiased" Guide to Bias in AI

#artificialintelligence | Dec-15-2022, 10:50:13 GMT

Whenever there is any mention of ethics in the context of AI, the topic of bias & fairness often follows. Similarly, whenever there is any mention of training and testing machine learning models, the trade-off between bias & variance features heavily. But do these two mentions of bias refer to the same thing? In order for machines to learn these patterns, especially in "supervised learning", they go through a training process whereby an algorithm extracts patterns from a training dataset, typically in an iterative manner. It then tests its predictions on an unseen (out-of-sample) test dataset to validate if the patterns it had learnt from the training dataset are valid. Bias: The action of supporting or opposing a particular person or thing in an unfair way, because of allowing personal opinions to influence your judgment.

artificial intelligence, ethical bias, machine learning, (19 more...)

#artificialintelligence

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Law (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
